 Santiago Province


Combining Observational Data and Language for Species Range Estimation
Max Hamilton, Christian Lange, Elijah Cole, Alexander Shepard

Neural Information Processing Systems

Species range maps (SRMs) are essential tools for research and policy-making in ecology, conservation, and environmental management. However, traditional SRMs rely on the availability of environmental covariates and high-quality species location observation data, both of which can be challenging to obtain due to geographic inaccessibility and resource constraints. We propose a novel approach combining millions of citizen science species observations with textual descriptions from Wikipedia, covering habitat preferences and range descriptions for tens of thousands of species.


Minimax Forward and Backward Learning of Evolving Tasks with Performance Guarantees
Santiago Mazuelas

Neural Information Processing Systems

For a sequence of classification tasks that arrive over time, it is common that tasks are evolving in the sense that consecutive tasks often have a higher similarity. The incremental learning of a growing sequence of tasks holds promise to enable accurate classification even with few samples per task by leveraging information from all the tasks in the sequence (forward and backward learning). However, existing techniques developed for continual learning and concept drift adaptation are either designed for tasks with time-independent similarities or only aim to learn the last task in the sequence. This paper presents incremental minimax risk classifiers (IMRCs) that effectively exploit forward and backward learning and account for evolving tasks. In addition, we analytically characterize the performance improvement provided by forward and backward learning in terms of the tasks' expected quadratic change and the number of tasks. The experimental evaluation shows that IMRCs can result in a significant performance improvement, especially for reduced sample sizes.


Graph Reordering for Cache-Efficient Near Neighbor Search
Santiago Segarra, ECE Department, Rice University, Houston, TX 77005

Neural Information Processing Systems

Graph search is one of the most successful algorithmic trends in near neighbor search. Several of the most popular and empirically successful algorithms are, at their core, a greedy walk along a pruned near neighbor graph. However, graph traversal applications often suffer from poor memory access patterns, and near neighbor search is no exception to this rule. Our measurements show that popular search indices such as the hierarchical navigable small-world graph (HNSW) can have poor cache miss performance. To address this issue, we formulate the graph traversal problem as a cache hit maximization task and propose multiple graph reordering as a solution. Graph reordering is a memory layout optimization that groups commonly-accessed nodes together in memory.
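To make the idea of a memory layout optimization concrete, here is a minimal sketch that relabels the nodes of an adjacency list in breadth-first visit order, so that nodes reached together during a traversal receive nearby ids. The abstract does not name the reordering methods the paper evaluates, so BFS ordering is an assumption chosen only for illustration, and the function name `bfs_reorder` is hypothetical.

```python
from collections import deque


def bfs_reorder(adj, start=0):
    """Relabel nodes in BFS visit order (illustrative heuristic, not the paper's method).

    adj   -- adjacency list, adj[u] is a list of neighbor ids of u
    start -- node from which the first traversal begins
    Returns (new_adj, perm) where perm[old_id] == new_id.
    """
    n = len(adj)
    if n == 0:
        return [], []

    perm = [-1] * n  # -1 marks a node that has not been assigned a new id yet
    next_id = 0

    # Visit `start` first, then restart from any node left unvisited
    # so that disconnected components are also relabeled.
    roots = [start] + [v for v in range(n) if v != start]
    for root in roots:
        if perm[root] != -1:
            continue
        perm[root] = next_id
        next_id += 1
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if perm[v] == -1:
                    perm[v] = next_id
                    next_id += 1
                    queue.append(v)

    # Rebuild the adjacency list under the new labels.
    new_adj = [None] * n
    for old, nbrs in enumerate(adj):
        new_adj[perm[old]] = [perm[v] for v in nbrs]
    return new_adj, perm


if __name__ == "__main__":
    graph = [[2], [3], [0, 3], [1, 2]]
    reordered, mapping = bfs_reorder(graph)
    print(mapping)
    print(reordered)
```

In a real index the vectors and neighbor lists would then be stored in the new id order, so that a greedy walk touches adjacent memory more often.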


Variable-rate hierarchical CPC leads to acoustic unit discovery in speech
Santiago Cuervo, Adrian Łańcucki

Neural Information Processing Systems

The success of deep learning comes from its ability to capture the hierarchical structure of data by learning high-level representations defined in terms of low-level ones. In this paper we explore self-supervised learning of hierarchical representations of speech by applying multiple levels of Contrastive Predictive Coding (CPC). We observe that simply stacking two CPC models does not yield significant improvements over single-level architectures. Inspired by the fact that speech is often described as a sequence of discrete units unevenly distributed in time, we propose a model in which the output of a low-level CPC module is non-uniformly downsampled to directly minimize the loss of a high-level CPC module. The latter is designed to also enforce a prior of separability and discreteness in its representations by enforcing dissimilarity of successive high-level representations through focused negative sampling, and by quantization of the prediction targets. Accounting for the structure of the speech signal improves upon single-level CPC features and enhances the disentanglement of the learned representations, as measured by downstream speech recognition tasks, while resulting in a meaningful segmentation of the signal that closely resembles phone boundaries.


Multi-task Online Learning for Probabilistic Load Forecasting

arXiv.org Machine Learning

Load forecasting is essential for the efficient, reliable, and cost-effective management of power systems. Load forecasting performance can be improved by learning the similarities among multiple entities (e.g., regions, buildings). Techniques based on multi-task learning obtain predictions by leveraging consumption patterns from the historical load demand of multiple entities and their relationships. However, existing techniques cannot effectively assess inherent uncertainties in load demand or account for dynamic changes in consumption patterns. This paper proposes a multi-task learning technique for online and probabilistic load forecasting. This technique provides accurate probabilistic predictions for the loads of multiple entities by leveraging their dynamic similarities. The method's performance is evaluated using datasets that register the load demand of multiple entities and contain diverse and dynamic consumption patterns. The experimental results show that the proposed method can significantly enhance the effectiveness of current multi-task learning approaches across a wide variety of load consumption scenarios.


Auditing Fairness by Betting
Ben Chugg, Santiago Cortes-Gomez, Bryan Wilder

Neural Information Processing Systems

We provide practical, efficient, and nonparametric methods for auditing the fairness of deployed classification and regression models. Whereas previous work relies on a fixed-sample size, our methods are sequential and allow for the continuous monitoring of incoming data, making them highly amenable to tracking the fairness of real-world systems. We also allow the data to be collected by a probabilistic policy as opposed to sampled uniformly from the population. This enables auditing to be conducted on data gathered for another purpose. Moreover, this policy may change over time and different policies may be used on different subpopulations. Finally, our methods can handle distribution shift resulting from either changes to the model or changes in the underlying population. Our approach is based on recent progress in anytime-valid inference and game-theoretic statistics--the "testing by betting" framework in particular. These connections ensure that our methods are interpretable, fast, and easy to implement. We demonstrate the efficacy of our approach on three benchmark fairness datasets.
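The "testing by betting" idea can be sketched with a simple wealth process. The example below assumes paired model outputs in [0, 1], one from each of two groups, and tests the null hypothesis that their means are equal. The fixed bet size `lam` and the function name `betting_audit` are assumptions made for illustration; the abstract does not specify the paper's betting strategy. The bet is one-sided: it only gains wealth when the first group's outputs tend to exceed the second's.

```python
def betting_audit(pairs, alpha=0.05, lam=0.5):
    """Sequential test of equal group means via a betting wealth process.

    pairs -- iterable of (a, b), outputs in [0, 1] for group A and group B
    alpha -- significance level; reject once wealth reaches 1 / alpha
    lam   -- fixed bet size with |lam| < 1 (hypothetical choice), which
             keeps each factor 1 + lam * (a - b) strictly positive
    Returns (stopping_time, wealth); stopping_time is None if never rejected.
    """
    if not -1.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between -1 and 1")

    wealth = 1.0
    for t, (a, b) in enumerate(pairs, start=1):
        # Under the null, E[a - b] = 0, so wealth is a nonnegative martingale.
        wealth *= 1.0 + lam * (a - b)
        # Ville's inequality bounds the chance of ever crossing 1 / alpha by alpha.
        if wealth >= 1.0 / alpha:
            return t, wealth
    return None, wealth


if __name__ == "__main__":
    print(betting_audit([(1.0, 0.0)] * 10))
    print(betting_audit([(0.5, 0.5)] * 10))
```

Because the threshold check is valid at every step, the audit can stop as soon as the wealth crosses 1 / alpha, which is what allows continuous monitoring rather than a fixed sample size.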


Congratulations to the #ECAI2024 outstanding paper award winners

AIHub

The 27th European Conference on Artificial Intelligence (ECAI-2024) took place from 19-24 October in Santiago de Compostela, Spain. The venue also played host to the 13th Conference on Prestigious Applications of Intelligent Systems (PAIS-2024). During the week, both conferences announced their outstanding paper award winners. The winning articles were chosen based on the reviews written during the paper selection process, nominations submitted by individual members of the programme committee, additional input solicited from outside experts, and the judgement of the programme committee chairs. Abstract: Proper losses such as cross-entropy incentivize classifiers to produce class probabilities that are well-calibrated on the training data.


How Gears of War's Mad World trailer changed video game marketing forever

The Guardian

At the Xbox Games Showcase this June, Microsoft debuted a trailer for the eighth game in the violent, grandiose and unexpectedly maudlin Gears of War series: a prequel. The sight of series heroes Marcus Fenix and Dom Santiago as younger men is "an emotional homecoming like no other", as Microsoft's Xbox blog put it. But the real tug at the heartstrings comes with the first notes of a slow, instrumental rendition of Tears for Fears' Mad World. "As a 41-year-old man, that piano got me tearing up," wrote one YouTube commenter. It's a throwback to the original, iconic Gears of War trailer from 2006, in which a lonesome Fenix picks through his ruined world to Gary Jules' plaintive cover of the same song.


Scalable Property Valuation Models via Graph-based Deep Learning

arXiv.org Artificial Intelligence

This paper aims to enrich the capabilities of existing deep learning-based automated valuation models through an efficient graph representation of peer dependencies, thus capturing intricate spatial relationships. In particular, we develop two novel graph neural network models that effectively identify sequences of neighboring houses with similar features, employing different message passing algorithms. The first strategy considers standard spatial graph convolutions, while the second one utilizes transformer graph convolutions. This approach confers scalability to the modeling process. The experimental evaluation is conducted using a proprietary dataset comprising approximately 200,000 houses located in Santiago, Chile. We show that employing tailored graph neural networks significantly improves the accuracy of house price prediction, especially when utilizing transformer convolutional message passing layers.